Speech summarization, which generates a text summary from speech, can be achieved by combining automatic speech recognition (ASR) and text summarization (TS). With this cascade approach, we can exploit state-of-the-art models and large training datasets for both subtasks, i.e., Transformer for ASR and Bidirectional Encoder Representations from Transformers (BERT) for TS. However, ASR errors directly affect the quality of the output summary in the cascade approach. We propose a cascade speech summarization model that is robust to ASR errors and that exploits multiple hypotheses generated by ASR to attenuate the effect of ASR errors on the summary. We investigate several schemes for combining ASR hypotheses. First, we propose using as input to the BERT-based TS system the sum of subword embedding vectors weighted by their posterior values provided by the ASR system. Then, we introduce a more general scheme that uses an attention-based fusion module, added to a pre-trained BERT module, to align and combine several ASR hypotheses. Finally, we perform speech summarization experiments on the How2 dataset and on a newly assembled TED-based dataset that we will release with this paper. These experiments show that retraining the BERT-based TS system with these schemes improves summarization performance, and that the attention-based fusion module is particularly effective.
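The first fusion scheme above, a posterior-weighted sum of subword embeddings fed to the TS system, can be sketched as follows. This is a minimal illustration under assumed names and shapes, not the authors' implementation: `candidates` stands in for the competing ASR subwords at one input position, each paired with its posterior probability.

```python
import numpy as np

def posterior_weighted_embedding(candidates, embedding_table):
    """Fuse competing ASR subword candidates at one position.

    candidates: list of (subword_id, posterior) pairs for this position
    embedding_table: (vocab_size, dim) array of subword embeddings

    Returns the posterior-weighted sum of the candidates' embedding
    vectors, which can then serve as one input vector to a BERT-based
    summarizer in place of a single subword embedding.
    """
    vec = np.zeros(embedding_table.shape[1])
    total = sum(p for _, p in candidates)  # renormalize the posteriors
    for subword_id, posterior in candidates:
        vec += (posterior / total) * embedding_table[subword_id]
    return vec
```

With a single candidate of posterior 1.0, this reduces to the ordinary embedding lookup, so the scheme degrades gracefully when the ASR system is confident.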
Recently, extensive studies on photonic reinforcement learning, which accelerates computation by exploiting the physical nature of light, have been conducted. Previous studies utilized quantum interference of photons to achieve collective decision-making without choice conflicts when solving the competitive multi-armed bandit problem, a fundamental example of reinforcement learning. However, the bandit problem deals with a static environment where the agent's actions do not influence the reward probabilities. This study aims to extend the conventional approach to more general multi-agent reinforcement learning, targeting the grid world problem. Unlike the conventional approach, the proposed scheme deals with a dynamic environment where the reward changes because of agents' actions. A successful photonic reinforcement learning scheme requires both a photonic system that contributes to the quality of learning and a suitable algorithm. This study proposes a novel learning algorithm, discontinuous bandit Q-learning, in view of a potential photonic implementation. Here, state-action pairs in the environment are regarded as slot machines in the context of the bandit problem, and the update amount of the Q-value is regarded as the reward of the bandit problem. We perform numerical simulations to validate the effectiveness of the bandit algorithm. In addition, we propose a multi-agent architecture in which agents are indirectly connected through quantum interference of light, and quantum principles ensure the conflict-free property of state-action pair selections among agents. We demonstrate that multi-agent reinforcement learning can be accelerated owing to conflict avoidance among multiple agents.
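The correspondence described above, in which each state-action pair plays the role of a slot machine and the Q-value update amount plays the role of that arm's bandit reward, can be sketched with a standard tabular Q-learning step. This is an illustrative sketch, not the paper's discontinuous bandit Q-learning algorithm or its photonic implementation; the function name and defaults are assumptions.

```python
from collections import defaultdict

def bandit_q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step.

    Following the abstract's framing, the state-action pair (s, a) is
    treated as a slot machine, and the update amount applied to Q[s][a]
    is treated as the 'reward' the bandit algorithm observes for that
    arm. Returns that update amount.
    """
    best_next = max(Q[s_next].values(), default=0.0)
    td_error = r + gamma * best_next - Q[s][a]
    Q[s][a] += alpha * td_error
    return alpha * td_error

# Q-table as a nested defaultdict: unseen state-action pairs start at 0
Q = defaultdict(lambda: defaultdict(float))
```

A bandit arm-selection policy could then rank state-action pairs by these accumulated "rewards", which is where a conflict-free photonic selector would plug in for the multi-agent case.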
Collective decision-making is vital for recent information and communication technologies. In our previous research, we mathematically derived conflict-free joint decision-making that optimally satisfies players' probabilistic preference profiles. However, two problems exist regarding the optimal joint decision-making method. First, as the number of choices increases, the computational cost of calculating the optimal joint selection probability matrix explodes. Second, to derive the optimal joint selection probability matrix, all players must disclose their probabilistic preferences. Notably, however, an explicit calculation of the joint probability distribution is not necessarily required; what collective decision-making needs is sampling. This study investigates several sampling methods that converge to heuristic joint selection probability matrices that satisfy players' preferences. We show that they can greatly reduce the above computational cost and confidentiality issues. We analyze the probability distribution each sampling method converges to, along with the computational cost it requires and the confidentiality it secures. In particular, we introduce two conflict-free joint sampling methods based on the quantum interference of photons. The first system allows players to hide their choices while almost perfectly satisfying their preferences when the players share the same preference. The second system, in which the physical nature of light replaces the expensive computation, also conceals their choices, given that they have a trusted third party.
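The core idea, sampling conflict-free joint choices directly rather than computing the full joint selection probability matrix, can be illustrated with a simple classical heuristic for two players. This is a sketch under assumed conventions, not one of the paper's quantum-optical methods: the sequential renormalization step here is an illustrative heuristic, and it does not provide the confidentiality properties the abstract describes.

```python
import random

def conflict_free_sample(pref_a, pref_b):
    """Heuristically sample one conflict-free joint choice.

    pref_a, pref_b: preference probabilities of players A and B over
    the same set of choices. Player A samples from its preference;
    player B then samples from its preference renormalized over the
    remaining choices, so the two selections never coincide.
    """
    n = len(pref_a)
    i = random.choices(range(n), weights=pref_a)[0]
    remaining = [k for k in range(n) if k != i]
    weights_b = [pref_b[k] for k in remaining]
    j = random.choices(remaining, weights=weights_b)[0]
    return i, j
```

Repeated draws from such a sampler approximate some conflict-free joint selection matrix without ever materializing it, which is the cost saving the abstract points to; the photonic schemes additionally hide each player's realized choice.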